As Charles Darwin noted early on, languages behave and change very much like living
species. They display high diversity, differentiate in space and time, emerge and disappear. A 
large body of literature has explored the role of information exchanges and communicative
constraints in groups of agents under selective scenarios. These models have been very
helpful in providing a rationale on how complex forms of communication emerge under
evolutionary pressures. However, other patterns of large-scale organization can be described 
using mathematical methods ignoring communicative traits. These approaches consider shorter 
time scales and have been developed by exploiting both theoretical ecology and statistical 
physics methods. The models are reviewed here and include extinction, invasion, origination, 
spatial organization, coexistence and diversity as key concepts and are very simple in their 
defining rules. Such simplicity is used in order to catch the most fundamental laws of 
organization and those universal ingredients responsible for qualitative traits. The similarities 
between observed and predicted patterns indicate that an ecological theory of language is 
emerging, supporting (on a quantitative basis) its ecological nature, although key differences 
are also present. Here we critically review some recent advances and outline their
implications and limitations as well as open problems for future research. 
1. Introduction

Languages and species share some remarkable commonalities.
Such similarities did not escape the attention
of Charles Darwin, who mentioned them a number
of times in writings and letters (see Whitfield, 2008). In
The Descent of Man (Darwin, 1871) he explicitly says:

The formation of different languages and of 
distinct species, and the proofs that both have 
been developed through a gradual process, are 
curiously parallel 

Languages indeed behave as some kind of living 
species (Mufwene 2001; Pagel 2009). They exhibit a 
large diversity: it is estimated that around 6000 different 
languages exist today in our modern world (Krauss, 
1992; Nettle and Romaine, 2000; McWhorter, 2001).
Languages and genes are known to be correlated at 
both global (Cavalli-Sforza et al. 1988; Cavalli-Sforza, 
2000) and local (see Lansing et al., 2007 and references 
therein) population scales. As with biodiversity
estimates, the actual language diversity is
unknown, and estimates fluctuate up to around 10000
different spoken languages. Needless to say, another
element to consider is the internal diversity displayed by
languages themselves, where (like subspecies) dialects
abound.

Languages also display geographical variation: as 
it occurs with species, they become more and more 
different under the presence of physical barriers. They 
come to life, as species appear by speciation. They 
also become extinct, and language extinction has become
a major problem for our cultural heritage: as it occurs
with endangered species, many languages are also on 
the verge of disappearance (Crystal, 2000; Sutherland, 
2003; Dalby 2003; Mufwene, 2004). Languages die with 
their last speaker: Crystal mentions the example of Ole 
Stig Andersen, a researcher looking in 1992 for the last 
speaker of the West Caucasian language Ubuh. In the 
words of Andersen: 

(The Ubuh) ... died at day break, October 
8th 1992, when the last speaker, Tevfik Esenç,
passed away. I happened to arrive in his village
that very same day, without appointment, to 
interview the Last Speaker, only to learn that 
he had died just a couple of hours earlier. 

This story dramatically illustrates the last breath of any 
extinct language. It dies as soon as its last speaker dies 
(or stops using it). It is also interesting to observe that 
the extinction risk and its correlation with geographical 
distribution is shared by both species and languages 
(Sutherland, 2003). 

Language change involves both evolutionary and
ecological time scales. Most theoretical studies deal 
with large-scale evolution: how languages emerge and 
become shaped by natural selection (Hawkins and Gell- 
Mann 1992; Nowak and Krakauer, 1994; Deacon 1997; 
Parisi 1997; Cangelosi and Parisi 1998; Pinker 2000; 
Cangelosi 2001; Kirby 2002; Hauser et al. 2002; Wray 
2002; Brighton et al. 2005; Kosmidis et al., 2005, 2006; 
Baxter et al., 2006; Szamado and Szathmary 2006; 
Oudeyer and Kaplan 2007; Floreano et al., 2007; Lipson 
2007; Christiansen and Chater 2008; Chater et al., 2009; 
Nolfi and Mirolli 2010). But languages also display 
changes within the short time scale of one or a few 
human generations. Actually, a great deal of what will 
happen to languages in the future is deeply related to 
their ecological nature. Demographic growth, the dom- 
inant role of cities in social and economic organization 
and globalization dynamics will largely shape the world's
languages (Graddol, 2004). 

Languages evolve under centuries of accumulated 
modifications (this is well illustrated by written texts, 
see Howe et al., 2001; Bennett et al., 2003) and undergo
evolutionary bursts (Atkinson et al., 2008). On short 
time scales they can be described in terms of ecologi- 
cal systems. These rapid modifications affect language 
diversity, their internal differentiation and even their 
survival. Different studies using the perspective of sta- 
tistical physics (Nettle 1999a-c; Benedetto et al., 2002; 
Stauffer and Schulze, 2005; Wang and Minett, 2005; 
Ke et al., 2002, 2008; Loreto and Steels, 2007; Zanette, 
2008; de Oliveira et al., 2008) have been able to cope 
with these phenomena, showing that the basic trends of 
language dynamics share remarkable similarities with 
the spatiotemporal behavior of complex ecosystems. 

We will consider different levels of language orga- 
nization, from words to languages as abstract entities. 
The models reviewed here explore the conditions under 
which words or languages can survive or disappear. The 
time scale is ecological; therefore we assume that in 
short time scales the dynamics of change does not affect 
the structure of language itself and thus evolutionary 
models are not considered. Moreover, we do not intend 
to quantitatively reproduce observed patterns, although 
the predictions of the models can be tested in many 
cases from real data. Instead, the models we review try
to capture the logic of the underlying processes in a
qualitative fashion. These models follow the spirit of
statistical physics in trying to reduce a system's complexity
to its bare bones. They provide a powerful
approximation that allows us to see global patterns
that might not depend on the intrinsic nature of the
components involved. They also help to highlight the
differences. As will be discussed below, languages also
exhibit marked departures from ecological traits. 

This review critically examines a set of models of 
increasing complexity. Specifically, we review recent 
advances within the fields of statistical physics and 
theoretical ecology relative to a better understanding 
of language dynamics. We begin with a very simple 
model describing word propagation within a population.
Next, we examine the effects and consequences of competition
among linguistic variants, with special attention to
those scenarios leading to language extinction. This is
expanded by considering alternative scenarios allowing
language coexistence to occur, either through bilingualism
or through spatial and social segregation. Although
spatial coexistence under local competition is shared
with ecosystems, bilingualism belongs to a different
class of phenomena. All these models involve a small
number of interacting languages. The final part of the 
review deals with language diversity in space and time. 
Both a simple model of multilingual communities and 
available data on scaling laws in language diversity are 
presented. Once again, striking similarities and strong 
differences are found. A synthesis of these ideas and 
open problems is presented at the end, together with a 
table comparing the properties of languages and ecosystems.

2. Lexical diffusion

The potential set of words used by a community of
speakers is listed in dictionaries (Miller, 1991). They capture
a given time snapshot of the available vocabulary, but 
in reality speakers only use part of the possible words: 
many are technical and thus only used by a given 
group and many are seldom used. Many words are 
actually extinct, since no one is using them. On the 
other hand, it is also true that dictionaries do not 
include all words used by the community and also that
new words are likely to be created constantly within
populations; their origins have sometimes been
recorded (Chantrell, 2002). Many of them are new uses
of previous words or recombinations and sometimes 
they come from technology. One of the challenges of 
current theories of language dynamics is understanding 
how words originate, change and spread within and 
between populations, eventually being fixed or extinct. 
In this context, the appearance of a new word has been 
compared to a mutation (Cavalli-Sforza and Feldman, 
1981). 

As it occurs with mutational events in standard pop- 
ulation genetics, new words or sounds can disappear, 
randomly fluctuate or get fixed. In this context, the 
idea that words, grammatical constructions or sounds 
can spread through a given population was originally 
formulated by William Wang. It was proposed in order 
to explain how lexical diffusion (i.e. the spread across
the lexicon) occurs (Wang 1969). Such a process requires
the diffusion of the innovation from speaker to speaker
(Wang and Minett 2005). 

2.1. Logistic spreading 

A very first modeling approximation to lexical diffusion 
in populations should account for the spread of words 



Prepared using rsifpublic.cis 



The ecophysics of language R. V. Sole et al. 







Figure 1. (a) Bifurcations in word learning dynamics: using a simple model of epidemic spreading of words, two different
regimes are present. If the rate of word learning exceeds one (i.e. $R_i > 1$), a stable fraction of the population will use it. If
not, then a well-defined threshold is found (a phase transition) leading to word extinction. The inset shows an example of the
logistic (S-shaped) growth curve for $R_i = 1.5$ and $x_i(0) = 0.01$. Lexical diffusion also occurs in so-called naming games among
artificial agents (b) where words are generated, communicated and eventually shared by artificial, embodied agents such as
robots (picture courtesy of Luc Steels, SONY Labs). As common words get shared, a common vocabulary is generated and
eventually stabilized. The dynamics of these exchanges also follows an S-shaped pattern.



as a consequence of learning processes (Shen 1997;
Wang et al., 2004; Wang and Minett 2005). Such a model
should be able to establish the conditions favouring
word fixation. As a first approximation, let us assume
that each item is incorporated independently (Shen,
1997; Nowak et al., 1999). If $x_i$ indicates the fraction
of the population knowing the word $W_i$, the population
dynamics of such a word reads:



$$\frac{dx_i}{dt} = R_i x_i (1 - x_i) - x_i, \tag{1}$$



with $i = 1, \dots, n$. The first term in the right-hand side
of the previous equation introduces the way words are
learned. The second deals with deaths of individuals at
a fixed rate (here normalized to one). The way words are
learned involves a nonlinear term where the interactions
between those individuals knowing $W_i$ (a fraction $x_i$)
and those ignoring it (a fraction $1 - x_i$) are present.
The parameter $R_i$ introduces the rate at which learning
takes place.

Two possible equilibrium points are allowed,
obtained from $dx_i/dt = 0$. The first is $x_i^* = 0$ and the
second:

$$x_i^* = 1 - \frac{1}{R_i}. \tag{2}$$

The first corresponds to the extinction of $W_i$ (or its
inability to propagate) whereas the second involves a
stable population knowing $W_i$. The stability of these
fixed points is determined by the sign of

$$\lambda(x_i^*) = \left[ \frac{\partial \dot{x}_i}{\partial x_i} \right]_{x_i^*}. \tag{3}$$

If $\lambda(x_i^*) < 0$ the point is stable and will be unstable
otherwise (Kaplan and Glass, 1995; Strogatz 2001).

The larger the value of $R_i$, the higher the number of
individuals using the word. We can see that for a word
to be maintained in the population lexicon, we require
the following inequality to be fulfilled:

$$R_i > 1. \tag{4}$$

This means that there is a threshold in the rate of word
propagation to sustain a stable population. By displaying
the stable population $x_i^*$ against $R_i$ (figure 1a) we
observe a well-defined phase transition phenomenon:
a sharp change occurs at $R_i = 1$, the critical point
separating the two possible phases. The subcritical
phase $R_i < 1$ will inevitably lead to the loss of the word.
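As a worked check of this threshold, and writing the right-hand side of eq. (1) as $f(x_i) = R_i x_i (1 - x_i) - x_i$ (the symbol $f$ is introduced here only for convenience), the stability criterion can be evaluated explicitly at both equilibria:

$$\lambda(x_i) = \frac{\partial f}{\partial x_i} = R_i (1 - 2 x_i) - 1, \qquad \lambda(0) = R_i - 1, \qquad \lambda\!\left(1 - \frac{1}{R_i}\right) = 1 - R_i.$$

Hence for $R_i > 1$ the extinction state is unstable and the nontrivial state is stable, whereas for $R_i < 1$ the extinction state is the stable one (the nontrivial equilibrium is then negative and has no meaning as a population fraction).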

The dynamical pattern displayed by a successfully
propagating word follows a so-called S-shaped curve
(see Niyogi, 2006, and references therein concerning
the gradualness and abruptness of linguistic change).
This can be easily seen by integrating the previous
model. Let us first note that the original equation (1)
can be rewritten as a logistic one, namely:



$$\frac{dx_i}{dt} = (R_i - 1)\, x_i \left( 1 - \frac{x_i}{1 - 1/R_i} \right), \tag{5}$$



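which, for an initial condition $x_i(0)$ at $t = 0$, gives the
solution

$$x_i(t) = \frac{x_i^* \, x_i(0)\, e^{(R_i - 1)t}}{x_i^* + x_i(0) \left( e^{(R_i - 1)t} - 1 \right)}, \tag{6}$$

where $x_i^* = 1 - 1/R_i$.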



This curve is known to increase exponentially at low
population values, describing a scenario where words
rapidly propagate, followed by a slow down as the 
number of potential learners decays. The accelerated, 
exponential growth has been dubbed the snowball effect 
(Wang and Minett, 2005) and such curves have been 
fitted to available data (Wang 1969). Therefore, a 
central property of linguistic change, namely its grad- 
ualness, can be derived as an epiphenomenon from the 
dynamical patterns of successful propagation in the case
of lexical diffusion. A further issue would be to explore
whether the gradualness of grammatical (phonological, 
morphological and syntactical) change can be derived 
from equations similar to those that model the diffusion 
of words. It must be noted, from a different perspective, 
that the logistic trajectory of linguistic change may 
be favored by "the underlying dynamics of individual 
learners", as argued by Niyogi (Niyogi 2006, p. 167). 
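The S-shaped trajectory can be checked numerically. The following is a minimal sketch, assuming the dynamics $dx_i/dt = R_i x_i(1 - x_i) - x_i$ and its logistic solution with saturation level $1 - 1/R_i$; the function names and the forward-Euler scheme are illustrative choices, and the parameter values are those quoted in the caption of figure 1.

```python
import math

def word_rate(x, R):
    """Learning term R*x*(1-x) minus deaths at unit rate."""
    return R * x * (1.0 - x) - x

def closed_form(t, x0, R):
    """Logistic solution saturating at x* = 1 - 1/R (meaningful for R > 1)."""
    x_star = 1.0 - 1.0 / R
    growth = math.exp((R - 1.0) * t)
    return x_star * x0 * growth / (x_star + x0 * (growth - 1.0))

def integrate(x0, R, t_end, dt=0.001):
    """Forward-Euler integration from t = 0 to t_end."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * word_rate(x, R)
    return x

# Values from the caption of figure 1: R_i = 1.5, x_i(0) = 0.01.
R, x0 = 1.5, 0.01
for t in (0.0, 5.0, 10.0, 20.0, 40.0):
    print(f"t={t:5.1f}  euler={integrate(x0, R, t):.4f}  "
          f"exact={closed_form(t, x0, R):.4f}")

# Below the threshold (here R = 0.8, an illustrative value) the word is lost.
print("R=0.8, t=40:", integrate(x0, 0.8, 40.0))
```

The printed columns should agree closely, with slow initial growth, a rapid rise and saturation near $1 - 1/R_i = 1/3$.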

The previous toy model of word dynamics within 
populations is an oversimplification, but it illustrates 
fairly well a key aspect of language dynamics, which is 
also observed in ecology (Sole and Bascompte, 2006): 
thresholds exist and play a role (Nowak and Krakauer, 
1999). They remind us that, beyond the gradual nature 
of change that we perceive through our lives (mainly 
affecting the lexicon) sudden changes are also likely 
to occur. An important aspect not taken explicitly
into account by the previous model is the process of
word generation and modification. Words originate
within populations through different types of processes.
They are also incorporated by invasion from foreign
languages. Once again, the processes of word invasion
and origination recapitulate somehow the mechanisms 
of change in biological populations. 

2.2. Multidimensional diffusion 

Several modifications and extensions of the previous 
model have been suggested (Wang et al., 2004). They 
include considering multiple words involved in the diffu- 
sion process. This scenario would take into account the 
idea that words interact among them in multiple ways, 
and their diffusion can be constrained or enhanced by 
these interactions (Wang and Minett 2005). The resulting
model describes the dynamics of a given novelty $x_i$
and its previous form $y_i$ (these can correspond to two
words or sounds). Assuming conservation of their relative
abundances, i.e. $x_i + y_i = 1$, it is possible to show that
a set of equations



$$\frac{dx_i}{dt} = (1 - x_i) \sum_{j=1}^{N} \alpha_{ij} x_j, \tag{7}$$



with $i, j = 1, \dots, N$, describes the lexical diffusion process.
The matrix elements $\alpha_{ij}$ introduce the coupling
rate between a pair $(i, j)$ of words. It is interpreted as
the rate at which adoption of the new word $i$ is induced
by the frequency of other novel forms of word $j$. As it is
formulated, the stable states are all given by $x_i^* = 1$ and
thus (not surprisingly) there is no place for extinction,
although there exists some evidence for such a scenario,
where new items spread initially but eventually decay
(Ogura, 1993). An interesting extension of this problem
could take into account both positive and negative 
interactions. In this way, not only facilitation (as given 
by the positive interactions) but also competition would 
be considered. In other words, it seems reasonable to 
think that some words should be incompatible with 
others. This actually matches the problem of species 
invasion and assembly in multispecies communities 
(Levins 1968; Case 1990, 1991; Sole et al., 2002). For an 



exotic species invading a given community to succeed, 
some community-level constraints need to be satisfied.
It would be interesting to see if similar rules apply to 
the ups and downs of word spreading. 
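The absence of extinction in the coupled model can be illustrated with a short simulation. This is only a sketch, assuming the form $dx_i/dt = (1 - x_i)\sum_j \alpha_{ij} x_j$ with non-negative couplings; the matrix `alpha`, the initial frequencies and the function names are hypothetical values chosen for illustration.

```python
def diffusion_step(x, alpha, dt):
    """One forward-Euler step of dx_i/dt = (1 - x_i) * sum_j alpha[i][j] * x[j]."""
    n = len(x)
    drive = [sum(alpha[i][j] * x[j] for j in range(n)) for i in range(n)]
    return [x[i] + dt * (1.0 - x[i]) * drive[i] for i in range(n)]

def simulate(x0, alpha, t_end, dt=0.01):
    """Integrate the coupled system from the initial frequencies x0."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        x = diffusion_step(x, alpha, dt)
    return x

# Hypothetical coupling matrix for three novel forms (illustrative only).
alpha = [[0.5, 0.1, 0.0],
         [0.2, 0.3, 0.1],
         [0.0, 0.4, 0.2]]
x0 = [0.01, 0.02, 0.05]

final = simulate(x0, alpha, t_end=100.0)
print([round(v, 6) for v in final])
```

With purely facilitative (non-negative) couplings every novelty reaches fixation; allowing negative entries in `alpha` would be one way to explore the competitive extension suggested in the text.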

As in the previous subsection, it seems fair to us 
to pose the question of whether or not grammatical 
change can be modelled using equations similar to those 
explored in the study of lexical diffusion. As to multi- 
dimensional diffusion, it may be worth considering in 
future research whether the diffusion of a grammatical 
object such as a morphological paradigm or a syntactic 
structure can be described with an equation analogous
to eq. (7). It is also worth noting the existence of
implicational universals (Greenberg, 1963), which have
the form "given a grammatical property x in a language
L, we always find a property y in L", as well as the
crosslinguistic observation that certain properties tend
to entail other properties "with overwhelmingly greater
than chance frequency", to put it in Greenberg's famous
words. That is, crosslinguistic grammatical change cannot
be perfectly mapped into a pure diffusion process:
certain properties entail or tend to entail the presence 
or absence of certain properties, as different words may 
positively or negatively interact. 

2.3. Naming games 

A related problem which also involves the generation 
and spread of words is the so called naming game. 
The original formulation and implementation of this 
problem was proposed by Luc Steels as a model for 
the emergence of a shared vocabulary within a popu- 
lation of agents (Steels, 2001, 2003, 2005; see also Nolfi 
and Mirolli, 2010). Originally, this approach involved 
communication between two embodied communicating 
agents. These agents (figure 1b) are able to visually
identify objects from their environment and assign them
randomly generated names, which are then sent to
the other agent in a speaker-hearer kind of interaction.
Exchanges receive a payoff every time the same word
is used by both agents to name a given object. This is
done by means of a trial and error process where failures
are common at the beginning, as a common
lexicon slowly emerges. Specifically, the rules are:

1. The speaker selects an object. 

2. The speaker chooses a word describing the object 
from its inventory of word-object pairs. If it doesn't 
have a word then it invents one for the object. 
The speaker transmits the word-object pair to the 
listener. 

3. If the listener has the word-object pair then the 
transmission is a success. Both agents remove all 
other words describing the object from their inven- 
tory and keep only the single common word. 

4. If the listener does not have the word-object pair,
then the listener will add this new word to its
inventory. This is recorded as a failure.
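The four rules above can be sketched as a short simulation. This is a minimal illustration, not the original robotic implementation: it assumes a single object (so rule 1 is trivial), random pairing of agents, and integer labels standing in for invented words; the function name `naming_game` and its parameters are hypothetical.

```python
import random

def naming_game(n_agents=50, max_rounds=200000, seed=0):
    """Play the game for one object until every agent holds the same single word.

    Returns (rounds_played, success_flags, inventories).
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    successes = []
    next_word = 0  # fresh integers stand in for randomly invented names
    for round_no in range(1, max_rounds + 1):
        speaker, listener = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            inventories[speaker].add(next_word)  # rule 2: invent a word
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[listener]:
            # rule 3: success, both agents keep only the shared word
            inventories[speaker] = {word}
            inventories[listener] = {word}
            successes.append(True)
        else:
            # rule 4: failure, the listener adds the word
            inventories[listener].add(word)
            successes.append(False)
        if all(len(inv) == 1 and inv == inventories[0] for inv in inventories):
            return round_no, successes, inventories
    return max_rounds, successes, inventories

rounds, successes, inventories = naming_game(n_agents=50, seed=0)
early = sum(successes[:200]) / max(1, len(successes[:200]))
late = sum(successes[-200:]) / max(1, len(successes[-200:]))
print(f"rounds: {rounds}, early success rate: {early:.2f}, late success rate: {late:.2f}")
```

Comparing the success rate in early and late rounds gives a rough view of how agreement builds up as the shared word takes over.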

Eventually, a shared, stable repertoire gets fixed. The 
basic rules can be easily mapped into a toy model (the 
naming game model) involving many agents, by using 
a statistical physics approach (Baronchelli et al., 2006,
2008). Both hardware and simulated implementations 
Figure 2. The dynamics of language death. Here four different
cases are represented: (a) Scottish Gaelic, (b) Quechua
in Huanuco, Peru, (c) Welsh in Monmouthshire, Wales and
(d) Welsh in all of Wales, from historical data (filled squares)
and a single modern census (open circles). Fitted curves
show solutions of the Abrams-Strogatz model (schematically
indicated in the upper plot). Redrawn from Abrams and
Strogatz, 2003.



display an S-shaped growth of the vocabulary, although 
interesting differences arise when we take into account 
spatial effects and the pattern of relations between 
agents, describable as a complex network (Steels and 
McIntyre 2003; Dall'Asta et al., 2006; Lu et al. 2008;
Liu et al., 2009). 

3. Competition and extinction 

Languages are spoken by individuals, and the number 
of speakers provides a measure of language breadth. 
Because of both economic and social factors, a given 
language can become more efficient than others in 
recruiting new users and as a consequence it can 
reach a larger fraction or even exclude the second
language, which becomes extinct†. This replacement would
be a consequence of competition, one of the most 
essential components of ecological dynamics, which can 
be applied to language dynamics too. Early models of 
two-species competition define the basic formal scenario 
where species interactions under limited resources occur 
(Case, 2000). The standard model is provided by the 
classical Lotka-Volterra equations, namely: 



$$\frac{dx}{dt} = \mu_1 x \left( 1 - x - \beta_{12} y \right), \tag{8}$$

$$\frac{dy}{dt} = \mu_2 y \left( 1 - y - \beta_{21} x \right), \tag{9}$$

where $x$ and $y$ indicate the (normalized) populations of
competing species, $\mu_i$ indicate their (per capita) growth
rates and the coefficients $\beta_{ij}$ are the rates of interspecific
competition. We can see that for $\beta_{ij} = 0$ two independent
logistic equations would be obtained, whereas for nonzero
competition two possible scenarios are at work.

† Species and languages also become extinct under external events
(such as asteroid impacts or climate change). Sudden death of
a language can occur due to a volcanic eruption killing the
small population of speakers or (more often) as a consequence
of genocide (Nettle and Romaine 2002).
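The two competition regimes of the Lotka-Volterra equations (8) and (9) can be illustrated numerically. The sketch below assumes the standard reading of these regimes (stable coexistence when both $\beta_{ij} < 1$, exclusion of one competitor when both $\beta_{ij} > 1$); the parameter values, initial conditions and function names are illustrative.

```python
def lv_step(x, y, mu1, mu2, b12, b21, dt):
    """One forward-Euler step of the two-species competition equations."""
    dx = mu1 * x * (1.0 - x - b12 * y)
    dy = mu2 * y * (1.0 - y - b21 * x)
    return x + dt * dx, y + dt * dy

def lv_run(x0, y0, b12, b21, mu1=1.0, mu2=1.0, t_end=200.0, dt=0.01):
    """Integrate from (x0, y0) and return the final populations."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = lv_step(x, y, mu1, mu2, b12, b21, dt)
    return x, y

weak = lv_run(0.1, 0.2, b12=0.5, b21=0.5)    # weak competition
strong = lv_run(0.3, 0.2, b12=1.5, b21=1.5)  # strong competition

print("weak competition  :", weak)    # both populations persist
print("strong competition:", strong)  # the initially larger one excludes the other
```

In the symmetric weak case both populations settle at $1/(1 + \beta)$, while in the strong case the outcome depends on the initial condition.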

Understanding language competition dynamics is 
clearly important; if the exclusion scenario is also at 
work, then competition can imply extinction. Moreover, 
theoretical models can help in defining useful strategies 
for language preservation and revitalization (Fishman,
1991; 2001). Steady language decline has been observed
in some cases, when population records of speakers are 
available. This is illustrated in figure 2, where the decay 
over time of four different languages is depicted. All 
these languages were used by a minority of speakers, 
competing with a dominant tongue that was gradually 
adopted by speakers as the less used ones were aban- 
doned. This type of increasing return is common in 
economics, where positive feedbacks and amplification 
phenomena are common (Arthur, 1994).

A simple model was proposed by Abrams and Stro- 
gatz, which has been shown to provide a rationale for 
the shape of language decay (Abrams and Strogatz, 
2003; Stauffer et al., 2007). The model is based on 
the assumption that two languages are competing for 
a given population of potential speakers (the limiting 
resource) where we will indicate as x and y the relative 
frequency of each population (assuming that individuals 
are monolinguals, see below). The dynamics is governed 
by the following differential equation: 



$$\frac{dx}{dt} = y\, P_{a,s}[y \to x] - x\, P_{a,s}[x \to y], \tag{10}$$



where it is assumed that $P_{a,s}[y \to x] = 0$ if $x = 0$ and
that the population is constant ($x + y = 1$). The transition
probabilities depend on two parameters. The specific
model reads:



$$\frac{dx}{dt} = s (1 - x) x^a - (1 - s) x (1 - x)^a, \tag{11}$$



where the $s$ parameter indicates the so-called social
status of the language. Two extreme equilibrium states
are easily found after imposing $dx/dt = 0$. These are
$x^* = 0$ (zero population) and $x^* = 1$ (all speakers use
the language). In our case, the stability criterion gives
$\lambda(0) = s - 1 < 0$ and $\lambda(1) = -s < 0$ and thus both are
stable attractors.

Together with the exclusion points $x^* = 0$ and $x^* = 1$,
there is a third equilibrium point, which can be obtained
from:

$$s\, (x^*)^{a-1} = (1 - x^*)^{a-1} (1 - s), \tag{12}$$

and, after some algebra one finds that:

$$x^* = \left[ 1 + \left( \frac{s}{1 - s} \right)^{\frac{1}{a-1}} \right]^{-1}. \tag{13}$$

Given the stable character of the other two fixed points,
$x^*$ can only be unstable and thus no coexistence is
allowed.
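The bistable character of the Abrams-Strogatz dynamics can be verified with a few lines of code. This is a sketch assuming $dx/dt = s(1 - x)x^a - (1 - s)x(1 - x)^a$ with $a > 1$; the values $s = 0.4$ and $a = 1.3$, the initial conditions and the function names are illustrative and are not fitted to any data set.

```python
def as_rate(x, s, a):
    """Net rate of change of the fraction x speaking the language of status s."""
    return s * (1.0 - x) * x ** a - (1.0 - s) * x * (1.0 - x) ** a

def interior_fixed_point(s, a):
    """Solve s * x**(a-1) = (1-s) * (1-x)**(a-1) for x (requires a > 1)."""
    return 1.0 / (1.0 + (s / (1.0 - s)) ** (1.0 / (a - 1.0)))

def as_run(x0, s, a, t_end=500.0, dt=0.01):
    """Forward-Euler integration starting from x0."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * as_rate(x, s, a)
        x = min(max(x, 0.0), 1.0)  # guard against round-off leaving [0, 1]
    return x

s, a = 0.4, 1.3
x_u = interior_fixed_point(s, a)
print("unstable equilibrium:", round(x_u, 4))
print("start below it      :", as_run(0.70, s, a))  # decays towards 0
print("start above it      :", as_run(0.85, s, a))  # grows towards 1
```

The interior equilibrium acts as a separatrix: the lower-status language survives only if it starts above it.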
Figure 3. Dynamics of language use under the presence
of bilingual speakers. Here three types of speakers are
considered (a). In (b) we show the fraction of speakers vs.
time in Galicia (North Western Spain). The smooth curves
(modified after Mira and Paredes, 2005) are the result of
fitting a modified Abrams-Strogatz model (see text).



The model has been used to fit available data on language
decay (figure 2) and assumes a scenario of minority
languages competing with widely used, majority
tongues. One clear implication of the stability analysis
is that the extinction of one of the competing languages
is inevitable. The social parameter will influence which 
language will get extinct. Nonetheless, linguistic diversi- 
fication seems unavoidable: the language that succeeds 
in the competition situation will become more and more 
diverse as it extends through time and space, and it 
may end up yielding mutually unintelligible linguistic 
variants. 

The AS model does not take into account that a 
fraction of individuals is likely (under some circum- 
stances) to become bilingual. It might seem a not so 
relevant item, but bilingualism actually introduces a 
very interesting ingredient to our view of language 
change, to be outlined in the next section. 

4. Coexistence and bilingualism 

The previous model is simplified in many respects. 
By considering human populations as homogeneous 
systems, geographical effects and some idiosyncrasies
of human language (not shared with ecosystems) are 
ignored. Spatial effects will be explored in the next 
section. Here we concentrate on a special property of 
human communities, namely the presence of individuals
who are grammatically and communicatively competent
in more than one language. Actually, a large
fraction of humankind uses more than one tongue for
communication. Historical reasons and the influence
of modern invasions by languages like English make
multilingualism an important ingredient to take into
account.

The Abrams-Strogatz model can be easily expanded
(figure 3a) by assuming that two languages are present
but bilingual speakers are also allowed (Mira and Paredes,
2005; Castello et al., 2006; see also Minett and
Wang 2008). The basic idea behind this approach is
that the presence of bilingual speakers makes language
coexistence likely to occur, provided that the two languages
are close enough to each other. In this picture,
three variables are used: as in the AS model, $x$ and $y$
will be the fraction of speakers using languages $X$ and
$Y$. Moreover, a third group $B$ using both languages has
a size $b$ in such a way that $x + y + b = 1$. Transitions
are defined in similar ways (figure 3a). For example,
changes in $x$ would result from a kinetic equation:



$$\frac{dx}{dt} = y\, P[y \to x] + b\, P[b \to x] - x \left( P[x \to y] + P[x \to b] \right), \tag{14}$$

and the constant population constraint allows defining 
the model in terms of just two coupled equations, 
namely: 



$$\frac{dx}{dt} = c \left[ (1 - x)(1 - k)\, s_x (1 - y)^a - x (1 - s_x)(1 - x)^a \right], \tag{15}$$

$$\frac{dy}{dt} = c \left[ (1 - y)(1 - k)(1 - s_x)(1 - x)^a - y\, s_x (1 - y)^a \right], \tag{16}$$

where $k \in [0, 1]$ is a new parameter measuring the
degree of similarity among languages and the language
statuses are now indicated as $s_x$ and $s_y = 1 - s_x$, respectively.
The $k$ parameter provides a measure of the
likelihood that two single-language speakers can communicate
with each other. It also affects the probability
that a monolingual speaker becomes bilingual. We can
easily check that the model reduces to the AS scenario
for $k = b = 0$.

Available data from language change in Northern 
Spain (Mira and Paredes, 2005) provide a test of this 
model. Here the two languages are Castilian and Gali- 
cian, both derived from Latin. These languages allow a 
relatively good mutual understanding and parameters 
are easily estimated. For this data set, a best fit was 
obtained using $a = 1.5$, $s(\mathrm{Galician}) = 0.26$, $c \approx 0.1$ and
$k = 0.8$. As we can see, the apparent decline of Galician
is actually a consequence of a simultaneous increase of 
Castilian monolinguals and bilinguals. 
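The bilingual model can be integrated directly once the parameters are fixed. The sketch below uses the fitted values quoted above and assumes that $x$ denotes Galician monolinguals; the initial fractions are hypothetical and are not taken from the census data, and the function names are illustrative.

```python
def bilingual_rates(x, y, s_x, a, k, c):
    """Rates of change of the two monolingual fractions; b = 1 - x - y is implied."""
    s_y = 1.0 - s_x
    dx = c * ((1.0 - x) * (1.0 - k) * s_x * (1.0 - y) ** a
              - x * (1.0 - s_x) * (1.0 - x) ** a)
    dy = c * ((1.0 - y) * (1.0 - k) * s_y * (1.0 - x) ** a
              - y * (1.0 - s_y) * (1.0 - y) ** a)
    return dx, dy

def bilingual_run(x0, y0, s_x, a, k, c, t_end, dt=0.01):
    """Forward-Euler integration; returns the final (x, y, b)."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        dx, dy = bilingual_rates(x, y, s_x, a, k, c)
        x, y = x + dt * dx, y + dt * dy
    return x, y, 1.0 - x - y

# Fitted values quoted in the text; the initial condition is illustrative only.
a, s_galician, c, k = 1.5, 0.26, 0.1, 0.8
x_end, y_end, b_end = bilingual_run(0.6, 0.1, s_galician, a, k, c, t_end=100.0)
print(f"Galician only: {x_end:.3f}  Castilian only: {y_end:.3f}  bilingual: {b_end:.3f}")

# Consistency check: with k = 0 and no bilinguals (y = 1 - x) the x-equation
# reduces to the monolingual Abrams-Strogatz rate.
dx_as, dy_as = bilingual_rates(0.3, 0.7, 0.4, 1.3, 0.0, 1.0)
print("AS limit, dx + dy =", dx_as + dy_as)
```

The final lines check the reduction to the AS scenario for $k = b = 0$, where the two rates must cancel exactly.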

We should be aware of the overestimation of the 
role of the k parameter as a measure of the probability 
that a monolingual speaker becomes bilingual, since k 
is only an indicator of the degree of similarity among 
languages, and neglects the role of their social status. 
It is worth noting that many bilingual scenarios involve 
two highly differentiated languages, such as Basque and 
Castilian in northern Spain or Amazigh and Arabic in 
northern Africa. 

How likely is the bilingual scenario to be relevant 
in the future? Recent model approaches suggest that 
maintaining a bilingual society necessarily requires the 
maintenance of status as a control parameter (Chapel 
et al. 2010). On the one hand, preserving language 
diversity in a globalized world will need active efforts 
when small populations of speakers are involved. But on
the other hand, we must also take into account current 
demographic trends (Graddol, 2004) which will need to 
be incorporated into future models of language change. 
Against early predictions suggesting the dominant role 
of English as an exclusive language, the future looks 
multilingual. Different languages are gaining relevance 
as their social and economic status improves. More- 
over, other interesting tendencies start to develop as 
some languages (such as English, Portuguese or Dutch) 
spread beyond their original geographic domains. They 
not only become mutualistic (as a bilingual speaker 
acquires a higher social status) but can also develop 
internal differentiation. We should expect in the future 
to see the emergence of (perhaps unintelligible) dialects
of English, as it happened with Latin. 

5. Spatial dynamics 

The exclusion point resulting from the Lotka-Volterra 
equation and related models (such as Abrams- 
Strogatz's model) implies that strong competition leads 
to diversity reduction. Within the context of population
dynamics, such a result was challenged under the
introduction of spatial degrees of freedom (Sole et al., 
1993; see also Sole and Bascompte, 2007 for a review 
of results). Spatial dynamics involves two basic compo- 
nents. One is the reaction term, describing how pop- 
ulations interact (for example the previous equations 
described above). The second describes how populations 
move through space. It is well known that space is 
responsible for the emergence of qualitative changes 
in dynamical patterns (Turing, 1952; Bascompte and 
Sole, 2000; Dieckmann et al., 2000). Competition under 
spatial structure generates a completely novel result: 
since exclusion depends on initial conditions, the two 
potential attractors can be (locally) possible. Starting 
from random initial conditions, different species or 
languages can exclude each other at different locations. 

The extension of the Abrams-Strogatz model to 
space was performed by Patriarca and Leppanen (2004) 
who used a reaction-diffusion framework. The model 
considers the local dynamics of the normalized densities 
of speakers using a given language at each point r in 
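space. If $\phi_x(r,t)$ and $\phi_y(r,t)$ indicate the local densities
of X and Y at a given point and time, they read:

$$\frac{\partial \phi_x(r,t)}{\partial t} = F(\phi_x,\phi_y) + D_x \nabla^2 \phi_x(r,t), \qquad (17)$$

$$\frac{\partial \phi_y(r,t)}{\partial t} = -F(\phi_x,\phi_y) + D_y \nabla^2 \phi_y(r,t), \qquad (18)$$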

where $F(\phi_x, \phi_y)$ is just the AS equation for the local
densities:

$$F(\phi_x,\phi_y) = s_x\,\phi_x^{a}\,\phi_y - s_y\,\phi_y^{a}\,\phi_x, \qquad (19)$$

and $s_x$, $s_y$ indicate the status of each language. The $D_i$'s
on the right-hand side of equations (17) and (18) are the so-called
diffusion coefficients associated with the spreading process.
The previous equations can be numerically integrated
(Dieckmann et al., 2000). We will illustrate this by using
a one-dimensional spatial system (the generalization to
two dimensions is straightforward).

Figure 4. Spatial segregation of languages over time. Here
we use the discretized equations of two competing languages
in order to calculate their populations of speakers (relative
frequency) over time. We start in (a) from two segregated
populations of speakers, each in a different domain and
having a Gaussian shape, with $N_x(0) = N_y(0) = 1/2$, $a = 1.3$
and status parameters fixed to $s_x = 1 - s_y = 0.55$. As we
can see (see text), although locally there is exclusion of one
language, globally both languages coexist. As time proceeds
(b-c) the spatial distribution converges to a homogeneous
state where each language survives in its own domain. Panels
(b) and (c) correspond to two successively later times.

First, we discretize $\partial\phi_x/\partial t$ as follows:

$$\frac{\partial \phi_x(r,t)}{\partial t} \approx \frac{\phi_x(r,t+\Delta t) - \phi_x(r,t)}{\Delta t}, \qquad (20)$$

where $r$ is the local position in the one-dimensional
domain $Z = [0, L]$ and $\Delta t$ is some characteristic time
scale. Similarly, the discretization of the diffusion term
is made as follows:

$$\frac{\partial^2 \phi_x(r,t)}{\partial r^2} \approx \frac{\phi_x(r+\Delta r,t) + \phi_x(r-\Delta r,t) - 2\phi_x(r,t)}{\Delta r^2}, \qquad (21)$$

with $\Delta r$ being the corresponding characteristic spatial scale.
Using these definitions, we obtain an equation for the
time evolution of $\phi_x(r,t)$:

$$\phi_x(r,t+\Delta t) = \phi_x(r,t) + \left[ F(\phi_x,\phi_y) + \frac{D_x}{\Delta r^2}\left( \phi_x(r+\Delta r,t) + \phi_x(r-\Delta r,t) - 2\phi_x(r,t) \right) \right] \Delta t.$$



Additionally, boundary conditions need to be
included. These allow defining the impact of finite size
effects and geography on the dynamics and equilibrium
states. A reasonable assumption is to use zero-flux
(Neumann) boundary conditions, namely

$$\frac{\partial \phi_x(0,t)}{\partial r} = \frac{\partial \phi_x(L,t)}{\partial r} = 0. \qquad (22)$$

In terms of our discretization, we would have $\phi_x(0,t) - \phi_x(\Delta r,t) = 0$
and $\phi_x(L,t) - \phi_x(L-\Delta r,t) = 0$.
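The explicit update scheme described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the local reaction term is taken to be the standard Abrams-Strogatz form $s_x \phi_x^a \phi_y - s_y \phi_y^a \phi_x$, the zero-flux boundaries are implemented with mirrored ghost cells (one common way to impose a vanishing gradient), and the grid size, time step, diffusion coefficients and Gaussian widths are arbitrary illustrative values, not those used for figure 4.

```python
import numpy as np

def as_reaction(phi_x, phi_y, s_x, s_y, a):
    """Local reaction term F (assumed Abrams-Strogatz form)."""
    return s_x * phi_x**a * phi_y - s_y * phi_y**a * phi_x

def laplacian_zero_flux(phi, dr):
    """Three-point second difference; ghost cells copy the edge values,
    so the gradient across each boundary vanishes (zero flux)."""
    padded = np.pad(phi, 1, mode="edge")
    return (padded[2:] + padded[:-2] - 2.0 * phi) / dr**2

def step(phi_x, phi_y, dt, dr, d_x, d_y, s_x, s_y, a):
    """One explicit Euler step of the two coupled reaction-diffusion equations."""
    f = as_reaction(phi_x, phi_y, s_x, s_y, a)
    new_x = phi_x + dt * (f + d_x * laplacian_zero_flux(phi_x, dr))
    new_y = phi_y + dt * (-f + d_y * laplacian_zero_flux(phi_y, dr))
    return new_x, new_y

def simulate(n_sites=200, length=100.0, steps=2000, dt=0.01,
             d_x=1.0, d_y=1.0, s_x=0.55, a=1.3):
    """Integrate from two segregated Gaussian populations, each of total size 1/2."""
    dr = length / n_sites
    r = (np.arange(n_sites) + 0.5) * dr
    width = length / 20.0  # illustrative Gaussian width
    g_x = np.exp(-0.5 * ((r - 0.25 * length) / width) ** 2)
    g_y = np.exp(-0.5 * ((r - 0.75 * length) / width) ** 2)
    phi_x = 0.5 * g_x / (g_x.sum() * dr)
    phi_y = 0.5 * g_y / (g_y.sum() * dr)
    for _ in range(steps):
        phi_x, phi_y = step(phi_x, phi_y, dt, dr, d_x, d_y, s_x, 1.0 - s_x, a)
    return r, phi_x, phi_y

r, phi_x, phi_y = simulate()
dr = r[1] - r[0]
print("N_x =", phi_x.sum() * dr, "N_y =", phi_y.sum() * dr)
```

The explicit scheme is only stable when $D\,\Delta t/\Delta r^2$ is small (at most 1/2 for pure diffusion), which is why the time step here is chosen well below that bound. Because the reaction terms enter with opposite signs and the boundaries carry no flux, the total population $N_x + N_y$ is conserved by construction.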

The dynamics starts with two populations of speakers
located in two different domains $Z_x$ and $Z_y$ (so
that $Z = Z_x \cup Z_y$). This is shown in figure 4a, where
we display the initial condition. If we label as $N_x$ and
$N_y$ the total populations of speakers in each domain
$\mu = 1, 2$, at a given domain $Z_\mu$ we would have:

$$N_i(t) = \int_{Z_\mu} \phi_i(r,t)\,dr, \qquad (23)$$



starting from $N_i = 1/2$ following a Gaussian shape (see
Patriarca and Leppanen, 2004). As the dynamics proceeds,
we can observe a tendency towards maintaining
the spatial segregation. Each language "wins" in its
initial domain, and eventually both reach a homogeneous
steady state within such domain. Generalizations
to heterogeneous domains reveal that the previous
patterns can be affected by both historical events and
spatial inhomogeneities (Patriarca and Heinsalu, 2008).
However, the main message from this approach is robust
and closely related to models of competing populations
in ecology (Sole et al., 1993; Sole and Bascompte,
2006). In summary, spatial degrees of freedom
have a great impact on the coexistence versus extinction
scenarios of language dynamics.

Space slows down the effects of competitive interactions,
effectively reducing competition at the local
scale. Moreover, diffusion (dispersal) allows
competition dynamics to create well-defined
domains where given languages or species have replaced
others. In this context, it is clear that the increasing
connectivity of our world due to globalization has
reduced the potential impact of geography on
the propagation of languages or epidemics (Buchanan,
2003). Although we do live on a two-dimensional surface,
the world has certainly changed and spatial constraints
have been strongly reduced.

6. String models of language change 

As already mentioned in section 2, a collection of words
provides a first definition of a language in terms of
its lexicon. This of course ignores a crucial component
of language: words interact in non-random ways and
higher-order levels of organization should be taken into
account. However, as occurs with some theoretical
models of diverse ecosystems (Sole and Bascompte,
2006), some relevant problems such as diversity and
its maintenance can be properly addressed by ignoring
interactions. Following this picture, we consider in this
section the lexical component of language viewed as a
bag of words and how a set of languages competing
for a given population of speakers can evolve towards
a single, dominant tongue or instead a diverse set of
coexisting languages.

Figure 5. String language model. Here a given set of elements
defines a language. Each (possible) language is defined by a
string of $L$ bits (here $L = 3$) and thus $2^L$ possible languages
are present in the hypercube. The two types of elements are
indicated as filled (1) and empty (0) circles, respectively.

A fruitful toy model of language change is provided
by the string approximation (Stauffer et al., 2006;
Zanette, 2008). In this approach, each language $\mathcal{L}_i$ is
treated as a binary string, i.e. $\mathcal{L}_i = (S_1^i, S_2^i, \ldots, S_L^i)$ of
length $L$. Here $S_j^i \in \{0, 1\}$ and, as defined, a finite but
very large set of potential languages exists. Specifically,
a set of languages $\mathcal{L}$ is defined, namely

$$\mathcal{L} = \{\mathcal{L}_1, \ldots, \mathcal{L}_M\}, \qquad (24)$$

with $M = 2^L$. These languages can be located at the
vertices of a hypercube, as shown in figure 5 for $L = 3$.
Nodes (languages) are linked through arrows (in both
directions) indicating that two connected languages
differ in a single bit. This is a very small system.
As $L$ increases, a combinatorial explosion of potential
strings takes place.
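The hypercube picture can be made concrete with a short Python sketch that enumerates all bit-string languages and the single-bit neighbours of each one. The function names (`all_languages`, `hamming`, `neighbours`) are illustrative choices, not part of any of the cited models.

```python
from itertools import product

def all_languages(length):
    """All 2**length binary strings, each represented as a tuple of bits."""
    return list(product((0, 1), repeat=length))

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def neighbours(lang):
    """Languages that differ from `lang` in exactly one bit (hypercube edges)."""
    return [lang[:k] + (1 - lang[k],) + lang[k + 1:] for k in range(len(lang))]

languages = all_languages(3)
print(len(languages))          # 8 vertices for L = 3
print(neighbours((0, 0, 0)))   # [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
```

Each vertex has exactly $L$ neighbours, while the number of vertices grows as $2^L$, which is the combinatorial explosion mentioned in the text.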

6.1. Mean field model 

A given language $\mathcal{L}_i$ is shared by a population of
speakers, to be indicated as $x_i$, and such that the
total population of speakers using any language is
normalized (i.e. $\sum_i x_i = 1$). A mean field model for
this class of description has been presented by Damian
Zanette, using a number of simplifications that allow
understanding the qualitative behavior of competing
and mutating languages (Zanette, 2008). A few basic
assumptions are made in order to construct the model.
First, a simple fitness function $\phi(x)$ is defined. This
function measures the likelihood of abandoning a language.
It is a decreasing function of $x$ such that
$\phi(0) = 1$ and $\phi(1) = 0$. Different choices are possible,
including for example $1 - x$, $1 - x^2$ or $(1 - x)^2$. On
the other hand, mutations are also included: a given
language can change if individuals modify some of their
bits.

The mean field model considers the time evolution
of populations assuming no spatial interactions. If we
indicate $\mathbf{x} = (x_1, \ldots, x_M)$, the basic equations will be
described in terms of two components:

$$\frac{dx_i}{dt} = A_i(\mathbf{x}) - M_i(\mathbf{x}), \qquad (25)$$

where both language abandonment $A_i(\mathbf{x})$ and mutation
$M_i(\mathbf{x})$ are introduced. Specifically, the following choices
are made:

$$A_i(\mathbf{x}) = \rho\, x_i \left( \langle \phi \rangle - \phi(x_i) \right), \qquad (26)$$

for the population dynamics of change due to abandonment.
This is a replicator equation, where the speed
of growth is defined by the difference between average
fitness $\langle \phi \rangle$, namely

$$\langle \phi \rangle = \sum_{j=1}^{N} \phi(x_j)\, x_j, \qquad (27)$$

and the actual fitness $\phi(x_i)$ of the $i$-th language (the sum
runs over all languages). Here
$\rho$ is the recruitment rate (assumed to be equal in all
languages). What this fitness function introduces is
a multiplicative effect: the more speakers that use a
given language, the more likely that they keep using
it and others join the same group. Conversely, if a given
language is rare, its speakers might easily shift to some
other, more common language.
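The multiplicative effect of the abandonment term can be illustrated numerically. The following Python sketch integrates only the replicator part, $dx_i/dt = \rho x_i(\langle\phi\rangle - \phi(x_i))$, with mutations switched off and the linear choice $\phi(x) = 1 - x$. The explicit Euler scheme, the step size and the three-language initial condition are assumptions made purely for illustration.

```python
import numpy as np

def abandonment(x, rho=1.0, phi=lambda v: 1.0 - v):
    """Replicator term rho * x_i * (<phi> - phi(x_i)) for every language."""
    fitness = phi(x)
    mean_fitness = np.sum(fitness * x)
    return rho * x * (mean_fitness - fitness)

def integrate(x0, rho=1.0, dt=0.01, steps=5000):
    """Explicit Euler integration of the abandonment dynamics (no mutation)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * abandonment(x, rho)
    return x

x_final = integrate([0.40, 0.35, 0.25])
print(x_final, x_final.sum())
```

Since the terms sum to zero whenever $\sum_i x_i = 1$, the normalization is preserved, and the initially largest language ends up absorbing almost all speakers: without mutation there is nothing to oppose the rich-get-richer effect.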

The second term includes all possible flows between
"neighboring" languages. It is defined as:

$$M_i(\mathbf{x}) = \mu \left( \sum_{j=1}^{N} W_{ij}\, x_j - x_i \sum_{j=1}^{N} W_{ji} \right). \qquad (28)$$

In this sum, we introduce the transition rates $W_{ij}$ of
mutating from language $\mathcal{L}_i$ to language $\mathcal{L}_j$ and vice
versa. Only single mutations are allowed, and thus
$W_{ij} = 1$ if the Hamming distance $D(\mathcal{L}_i, \mathcal{L}_j)$ is exactly
1 (and $W_{ij} = 0$ otherwise). More precisely, if

$$D(\mathcal{L}_i, \mathcal{L}_j) = \sum_{k=1}^{L} \left| S_k^i - S_k^j \right| = 1. \qquad (29)$$

In other words, only nearest-neighbor movements
through the hypercube are allowed. In summary,
$A(\mathbf{x})$ provides a description of competitive interactions
whereas $M(\mathbf{x})$ gives the contribution of small changes
in the string composition. The background "mutation"
rate $\mu$ is weighted by the matrix coefficients $W_{ij}$ associated
with the likelihood of each specific change to occur.

This model is a general description of the bit string
approximation to language dynamics. However, the
general solution cannot be found and we need to analyse
simpler cases. An example is provided in the next
section. Although the assumptions are rather strong,
numerical models with more relaxed assumptions seem
to confirm the basic results reported below.



6.2. Supersymmetric scenario 

A solvable limit case with obvious interest to our
discussion considers a population where a single language
has a population $x$ whereas all others have a
small, identical size, i.e. $x_j = (1 - x)/(N - 1)$. The
main objective of defining such a supersymmetric model is
making the previous system of equations collapse into a
single differential equation, which we can then analyze.
In particular, we want to determine when the $x = 0$
state will be observed, meaning that no single dominant
language is stable.

The normalization condition is now given by:

$$\sum_{i=1}^{N} x_i = x + \sum_{i=1}^{N-1} \frac{1-x}{N-1} = 1 \qquad (30)$$

(where we choose $x$ to be the $N$-th population, without
loss of generality). In this case the average fitness reads:

$$\langle \phi \rangle = \phi(x)\,x + \phi\!\left(\frac{1-x}{N-1}\right)(1-x). \qquad (31)$$

Using the special linear case $\phi(x) = 1 - x$, we obtain:

$$A(x) = \rho\, x\,(1-x) \left( x - \frac{1-x}{N-1} \right). \qquad (32)$$

The second term is easy to obtain: since x has (as any 
other language) exactly L nearest neighbors, and given 
the symmetry of our system, we have: 

And the final equation for $x$ is thus, for the large-$N$
limit (i.e. when $N \gg 1$):

$$\frac{dx}{dt} = \rho\, x^2 (1-x) - \mu x. \qquad (34)$$

This equation describes an interesting scenario where
growth is not logistic, unlike in our previous
model of word propagation. As we can see, the first term
on the right-hand side involves a quadratic component,
indicating a self-reinforcing phenomenon. This type of
model is typical of systems exhibiting cooperative interactions
and an important characteristic is its hyperbolic
dynamics: instead of an exponential-like approach
to the equilibrium state, a very fast approach takes
place.

The model has three equilibrium points: (a) the
extinction state, $x^* = 0$, where the large language disappears;
(b) two fixed points $x^*_\pm$ defined as:

$$x^*_\pm = \frac{1}{2}\left( 1 \pm \sqrt{1 - \frac{4\mu}{\rho}} \right). \qquad (35)$$

As we can see, these two fixed points exist provided that
$\mu < \mu_c = \rho/4$. Since three fixed points coexist in this
domain of parameter space, and the trivial one ($x^* = 0$)
is stable, the other two points, namely $x^*_-$ and $x^*_+$, must
be unstable and stable, respectively. If $\mu < \mu_c$, the upper
branch $x^*_+$, corresponding to a monolingual solution, is
stable.

Figure 6. Phase transitions as bifurcations in Zanette's mean field model of supersymmetric language competition. In (a)
we show the bifurcation diagram using $\mu/\rho$ as control parameter. Once we cross the critical point, a sharp transition occurs
from monolanguage to language diversity. This transition can be visualized using the potential function $\Phi(x)$ whose minima
correspond to possible equilibrium points. Here we use $\rho = 1$ with (b) $\mu = 0.1$, (c) $\mu = 0.2$ and (d) $\mu = 0.3$. In (e) we also
plot the phase diagram using the $(\rho, \mu)$ parameter space.
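The fixed-point structure of the equation $dx/dt = \rho x^2(1-x) - \mu x$ can be checked directly. The Python sketch below solves the quadratic $\rho x^2 - \rho x + \mu = 0$ for the non-trivial fixed points and classifies each point by the sign of the derivative of the right-hand side. The function names are illustrative, and the parameter values are those quoted in the caption of figure 6.

```python
import math

def rhs(x, rho, mu):
    """Right-hand side of the supersymmetric equation: rho*x^2*(1-x) - mu*x."""
    return rho * x**2 * (1.0 - x) - mu * x

def fixed_points(rho, mu):
    """Real fixed points in ascending order: 0 and, if mu <= rho/4, x_minus and x_plus."""
    points = [0.0]
    disc = 1.0 - 4.0 * mu / rho
    if disc >= 0.0:
        root = math.sqrt(disc)
        points += [0.5 * (1.0 - root), 0.5 * (1.0 + root)]
    return points

def is_stable(x_star, rho, mu):
    """Linear stability: the derivative 2*rho*x - 3*rho*x^2 - mu must be negative."""
    return 2.0 * rho * x_star - 3.0 * rho * x_star**2 - mu < 0.0

for mu in (0.1, 0.2, 0.3):
    pts = fixed_points(1.0, mu)
    print(mu, [(round(p, 4), is_stable(p, 1.0, mu)) for p in pts])
```

For $\rho = 1$ the two non-trivial points are present at $\mu = 0.1$ and $\mu = 0.2$ but not at $\mu = 0.3$, consistent with the threshold $\mu_c = \rho/4 = 0.25$.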

In figure 6a we illustrate these results by means of
the bifurcation diagram using $\rho = 1$ and different values
of $\mu$. In terms of the potential function we have:

$$\frac{dx}{dt} = -\frac{d\Phi_\mu(x)}{dx}, \qquad (36)$$

where $\Phi_\mu(x) = -\int (A(x) - B(x))\,dx$, with $A(x) = \rho x^2(1-x)$
and $B(x) = \mu x$ the two terms on the right-hand side of equation (34),
which for our system reads:

$$\Phi_\mu(x) = -\frac{\rho}{3}\,x^3 + \frac{\rho}{4}\,x^4 + \frac{\mu}{2}\,x^2. \qquad (37)$$

In figure 6b-d three examples of this potential are shown,
where we can see that the location of the equilibrium
point is shifted from the monolanguage state to the
diverse state as $\mu$ is tuned. The corresponding phases
in the $(\rho, \mu)$ parameter space are shown in figure 6e.
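The shift of the equilibrium can also be seen by locating the minima of the potential numerically. The sketch below assumes the potential obtained by integrating the right-hand side of the supersymmetric equation with zero integration constant, $\Phi_\mu(x) = -\rho x^3/3 + \rho x^4/4 + \mu x^2/2$, and scans it on a uniform grid over $[0, 1]$. The grid resolution and the function names are illustrative choices.

```python
import numpy as np

def potential(x, rho, mu):
    """Potential whose negative gradient gives rho*x^2*(1-x) - mu*x."""
    return -rho * x**3 / 3.0 + rho * x**4 / 4.0 + mu * x**2 / 2.0

def local_minima(rho, mu, n=2001):
    """Grid positions in [0, 1] where the potential has a local minimum."""
    x = np.linspace(0.0, 1.0, n)
    phi = potential(x, rho, mu)
    minima = []
    if phi[0] < phi[1]:                       # boundary minimum at x = 0
        minima.append(float(x[0]))
    interior = (phi[1:-1] < phi[:-2]) & (phi[1:-1] < phi[2:])
    minima.extend(float(v) for v in x[1:-1][interior])
    return minima

for mu in (0.1, 0.2, 0.3):
    print(mu, local_minima(1.0, mu))
```

Below the threshold the potential has two wells, one at $x = 0$ and one at the monolingual state; above it only the well at $x = 0$ survives.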

It is interesting to see that this model and its phase
transition are somewhat connected to the error threshold
problem associated with the dynamics of RNA viruses
(Domingo et al. 1995; Eigen et al., 1987). For a single
language to maintain its dominant position, it must
be efficient in recruiting and keeping speakers. But
it also needs to keep heterogeneity (resulting from
"mutations") at a reasonably low level. If changes go
beyond a given threshold, there is a runaway effect that
eventually pushes the system into a variety of coexisting
sub-languages. An error threshold is thus at work, but
in this case the transition is of first order. This result
would indicate that, provided that a source of change is
active and beyond threshold, the emergence of multiple
unintelligible tongues should be expected.

String models of this type only capture one layer of
word complexity. Perhaps future models will consider
ways of introducing further internal layers of organization
described in terms of superstrings. Such superstring
models should be able to introduce semantics,
phonology and other key features that are known to
be relevant. An example in this direction is provided by
models of the emergence of linguistic categories (Puglisi
et al., 2008).

7. Global patterns and scaling laws 

Tracking the relative importance of languages and in
particular their likelihood of going extinct requires
having the appropriate censuses of the number of speakers
using each language. The statistical patterns displayed
by languages in their spatial and demographic dimensions
provide further clues for the presence of non-trivial
links between language and ecology (Nettle 1998;
Pagel and Mace, 2004; Pagel 2009). These patterns also
provide a large-scale picture of languages, not restricted
to small geographical domains or countries. In this
section we consider two such statistical patterns. It is
important to notice that, strictly speaking, this problem
involves both ecological and evolutionary time scales. In
a given ecosystem, the succession process leading to a
mature, diverse community can be described in terms of
ecological dynamics. At this level, invasion and network
species interactions are both relevant. However, the
composition of the local pool of species is the outcome
of evolutionary dynamics.

Some spatial models of language change have been
presented in order to explain the results shown below
(see de Oliveira et al. 2005; de Oliveira et al., 2008). The
close correlation between species diversity and language
richness, as reported by different studies (Mace and
Pagel, 1995; Moore et al., 2002; Gaston 2005), suggests
that some rules of organization might be common. As
an example, a large-scale study of correlations among
biological species and cultural and linguistic diversity
in Africa (Moore et al., 2002) revealed that one third
of language richness can be explained on the basis of
environmental factors. These included rainfall and productivity,
which were shown to affect the distributions
of both species and languages. However, there are also
important differences that need an explanation.

Figure 7. Scaling law in the distribution of language diversity
$D$ as a function of area $A$. The best fit to the power law $D \sim A^z$
is shown. Redrawn from Gomes et al., 1999.

7.1. Species-area relations

One of the universal laws of ecological organization is
the so-called species-area relation (Rosenzweig, 1995).
It establishes that the diversity $D$ (measured as the
number of different species) in a given area $A$ follows
a power law

$$D \sim A^{z}, \qquad (38)$$

where the exponent $z$ typically varies from $z = 0.1$ to
$z = 0.45$. Interestingly, languages seem to follow similar
trends. They exhibit an enormous diversity, strongly
tied to geographical constraints. As occurs with
species distributions, languages and their evolution are
shaped by the presence of physical barriers, population
sizes and contingencies of many kinds. In this context,
differences are also clear: speciation in ecosystems can
take place without the presence of physical barriers,
whereas some type of population isolation seems necessary
for one language to yield two different languages,
i.e., two linguistic variants that are not fully interintelligible.
On the other hand, there is a continuous
drift in both species and languages that makes them
change. A second difference involves the way extinction
occurs. A species becomes extinct once the last of its members
is gone. A language also becomes extinct once it is no longer
used, even if its native speakers are still alive
(Dalby, 2005).

Studies of geographical patterns of language distribution
reveal complex phenomena at multiple scales. As an
example, it was shown that they also display a diversity-area
scaling law, with $z = 0.41 \pm 0.03$ (Gomes et al.,
1999). In figure 7 we show the results of this analysis for
a compilation listing more than 6700 languages spoken
in 228 countries. The power law fit is very good and
spans almost six decades (with a deviation for areas
smaller than 30 km$^2$) (Gomes et al., 1999). Similar
results are obtained by using population size $N$ instead
of areas. In this case, it was shown that the new power
law reads:

$$D \sim N^{\nu} \qquad (39)$$

with $\nu = 0.50 \pm 0.04$. However, a close inspection of
data reveals the impact of other forces acting on language
diversity. An example is the contrast between
Europe and New Guinea (see Diamond, 1997 and references
therein). The former has $10^7$ km$^2$ and includes
63 languages, whereas the latter, with less than one
tenth of Europe's surface, contains around $10^3$ different
languages. The singularity of New Guinea has been
carefully analysed by many authors. Take for example
Papua New Guinea, which contains just 0.1 percent
of the world's population but more than 13 percent
of the world's languages. It is geographically an extremely
irregular landscape, which creates multiple opportunities
for isolation. Moreover, 80 percent of its land is
covered by rainforests. Additionally, food production is
continuous, with no food shortages and a good yield.
Bilingualism is widespread, with most speakers of the
dominant Tok Pisin also speaking some local language
(being exposed to several). Given the high yields of
food harvest together with biogeographical constraints,
there has been little incentive to create large-scale trade.
A consequence of such a scenario is a dynamic equilibrium
far from language homogenization (see Nettle and
Romaine, 2000 for a review).
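Scaling exponents such as $z$ and $\nu$ are commonly estimated with a straight-line fit in log-log space. The Python sketch below shows that procedure on synthetic data only: the areas, the prefactor, the noise level and the "true" exponent are made-up values for illustration, and nothing here reproduces the data or the fitting method of Gomes et al. (1999).

```python
import numpy as np

def fit_power_law(area, diversity):
    """Least-squares fit of log D = log c + z * log A; returns (z, c)."""
    slope, intercept = np.polyfit(np.log(area), np.log(diversity), 1)
    return slope, np.exp(intercept)

# Synthetic example: D = c * A^z with multiplicative log-normal noise.
rng = np.random.default_rng(0)
area = np.logspace(2, 7, 60)                 # hypothetical areas
true_z, true_c = 0.41, 0.5                   # hypothetical parameters
noise = np.exp(rng.normal(0.0, 0.1, area.size))
diversity = true_c * area**true_z * noise

z_hat, c_hat = fit_power_law(area, diversity)
print("estimated z:", z_hat, "estimated prefactor:", c_hat)
```

A fit of this kind summarizes the average trend only. Outliers such as New Guinea, which sit far from the fitted line, are exactly the cases where factors other than area or population size dominate.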

The species-area relation has been explained in a
number of ways through models of population dynamics
on two-dimensional domains. Beyond their differences,
these models share the presence of stochastic dynamics
involving multiplicative processes. In ecology, such
processes are characterized by positive and negative
demographic responses proportional to the current
populations involved: a larger population will be more
likely to increase, but also more likely to suffer the
attack of a given parasite (and thus experience a rapid
decline). Within language, the rich-gets-richer effect is
obvious, whereas there is no equivalent for the negative
effects of "parasitic" languages.

7.2. Language richness laws 

A different measure of language diversity involves the
language richness among different countries. If $\mathcal{N}(D)$
is the frequency of countries with $D$ different languages
each, we can plot the cumulative distribution $\mathcal{N}_>(D)$
defined as:

$$\mathcal{N}_>(D) = \int_D^{\infty} \mathcal{N}(D')\,dD'. \qquad (40)$$

The resulting plot is rather interesting (figure 8a): the
distribution follows a two-regime scaling behavior, i.e.

$$\mathcal{N}_>(D) \sim D^{-\beta}, \qquad (41)$$

with $\beta = 0.6$ for $6 < D < 60$ and $\beta = 1.1$ for $60 < D < 700$.
What is revealed from this plot? The first domain
has an associated power law with a small exponent (here
$\mathcal{N}(D) \sim D^{-1.6}$): many countries have a small language
diversity. But once we cross a given threshold $D \approx 60$
the decay becomes faster. One possible interpretation
is that countries having a very large diversity will have
a harder time preserving their unity under the social
differentiation associated with ethnic diversity (Gomes et
al., 1999).

Figure 8. Scaling laws in language diversity. (a) Here we plot the cumulative distribution of languages using the number of
countries with a language diversity greater than $D$. Redrawn from Gomes et al., 1999. The marked area indicates the domain
of language-rich countries, whose distribution is steeper than the low-diversity domain. (b) Distribution of languages having
$N$ speakers. Here the data set for languages is compared with a simulation using a specific set of parameters (see de Oliveira
et al., 2008). Although different parameter sets give different curves, the qualitative behavior is always the same. In (c) we
show four snapshots of a model of language diversity dynamics on a two-dimensional lattice (adapted from de Oliveira et
al., 2008). Here each symbol type indicates one given language, whereas its size indicates the local population allowed. As
time proceeds, mutations arise and new languages emerge and spread (see text).
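The exponent of the frequency distribution quoted for the first regime follows from differentiating the cumulative distribution. As a short worked step, assuming a pure power law within each regime:

$$\mathcal{N}(D) = -\frac{d\mathcal{N}_>(D)}{dD} \sim \beta\, D^{-(\beta + 1)},$$

so that $\beta = 0.6$ corresponds to $\mathcal{N}(D) \sim D^{-1.6}$ in the low-diversity regime and $\beta = 1.1$ to $\mathcal{N}(D) \sim D^{-2.1}$ in the language-rich regime.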

A related distribution is given by the number of
languages $n_L(N)$ with a population size of $N$ speakers.
In figure 8b we display a log-log plot of the data set
(after binning) which shows a log-normal behavior, with
an enhanced number of small-sized languages. This
pattern (as well as the scaling with area) is reproduced
by a simple model presented below.

7.3. Language diversity model 

A simple spatial model has been proposed by de
Oliveira et al. (2008) as an extension of previous
work (de Oliveira et al., 2006; see also Silva and
Oliveira, 2008). The model combines a stochastic cellular
automaton approach with non-local rules and a bit-string
implementation. It starts from an empty lattice
$\Omega$ of $L \times L$ sites. Each site $(i, j) \in \Omega$ is characterized by
a random number $1 < K_{ij} < M$ (with uniform distribution)
representing the maximum population of speakers
achievable by the language occupying it (the carrying
capacity). Only one language $\mathcal{L}_i$ can be present at a
given site and (as in section 6) is represented by a string
$\mathcal{L}_i = (S_1^i, S_2^i, \ldots, S_L^i)$ of length $L$. A seed $\mathcal{L}_1$ is located
at $t = 0$ at a given site $(a, b)$, thus having a population
$K_{ab}$. Now dispersal to nearest neighbors in the lattice
occurs, favouring the spread towards sites having higher
$K_{ij}$. Moreover, at a given site the given language $\mathcal{L}_k$
can change (mutate) to a new one with a probability
$p = \alpha / f(\mathcal{L}_k)$. Here $f(\mathcal{L}_k)$ is the fitness associated with
$\mathcal{L}_k$, chosen as:

$$f(\mathcal{L}_k) = \sum_{(i,j) \in \Omega} K_{ij}\, \delta\!\left( \mathcal{L}(i,j), \mathcal{L}_k \right) \qquad (42)$$

where $\mathcal{L}(i,j)$ denotes the language occupying site $(i,j)$,
with $\delta(m, n) = 1$ if $m = n$ and zero otherwise. In words,
the fitness considers the total occupation of the lattice
(in terms of speakers), and the likelihood of a language
to mutate is thus size-dependent following an inverse
law. In this way we incorporate the well known fact
that the impact of mutations favours genetic drift. The
previous rules allow a diverse set of languages to expand
and eventually occupy the whole lattice. An example
is shown in figure 8c for a small ($L = 50$) lattice.
We can see how languages emerge and spread around,
generating monolingual patches.
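The colonization rules described above can be sketched as a small simulation. The Python code below is a simplified illustration and not the implementation of de Oliveira et al.: it assumes that empty sites adjacent to occupied ones are colonized one at a time with probability proportional to their carrying capacity, that the colonizing language is drawn uniformly from the occupied neighbours, and that a mutation flips one random bit of an integer-encoded bit string. All parameter values (`size`, `k_max`, `alpha`, `n_bits`) are hypothetical.

```python
import numpy as np

def simulate_lattice(size=20, k_max=50, alpha=5.0, n_bits=16, seed=0):
    """Fill a size x size lattice with languages; returns (capacity, language, fitness)."""
    rng = np.random.default_rng(seed)
    capacity = rng.integers(1, k_max + 1, size=(size, size))
    language = np.full((size, size), -1, dtype=np.int64)   # -1 marks an empty site
    fitness = {}   # language -> total carrying capacity of the sites it occupies

    def neighbours(i, j):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < size and 0 <= b < size:
                yield a, b

    def occupy(site, lang):
        language[site] = lang
        fitness[lang] = fitness.get(lang, 0) + int(capacity[site])

    seed_site = (size // 2, size // 2)
    occupy(seed_site, 0)                                    # seed: all bits zero
    frontier = set(neighbours(*seed_site))                  # empty sites next to occupied ones

    while frontier:
        sites = sorted(frontier)
        weights = np.array([capacity[s] for s in sites], dtype=float)
        target = sites[rng.choice(len(sites), p=weights / weights.sum())]
        sources = [s for s in neighbours(*target) if language[s] >= 0]
        lang = int(language[sources[rng.integers(len(sources))]])
        # Mutation probability is inversely proportional to the language's fitness.
        if rng.random() < min(1.0, alpha / fitness[lang]):
            lang ^= 1 << int(rng.integers(n_bits))          # flip one random bit
        occupy(target, lang)
        frontier.discard(target)
        frontier.update(s for s in neighbours(*target) if language[s] < 0)

    return capacity, language, fitness

capacity, language, fitness = simulate_lattice()
print("number of languages:", len(fitness))
```

Because the mutation probability decays as a language accumulates speakers, most new languages appear early, when the colonizing language is still small, and later growth mainly enlarges existing monolingual patches.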

In spite of its simplicity and strong assumptions, the
model is able to capture several qualitative properties
of both spatial and statistical power laws, similar to
those presented above (de Oliveira et al., 2006; 2008).
In some sense, we can conclude that the observed
commonalities point towards shared system-level properties.
This conclusion is partially true: the process of
ecosystem building can be understood in terms of a
spatial colonization of available patches. Each patch
offers a given range of conditions that make it more or
less suitable for the colonizer to persist. If colonization
occurs locally, nearest patches will be occupied by best-fit
competitors$^5$. In an ecological-like model, non-local
colonization events will occur due to the introduction
of species from the regional pool (see Sole et al., 2002)
but these events can also be interpreted as speciations.
Perhaps the most obvious difference with ecological
models is the assumption of a fitness trait that involves
the whole population of the species. Such a non-local
effect seems reasonable to assume when thinking of
language as a vehicle of economic influence. Larger
communities of speakers are likely to be much more
efficient in further expanding.

8. Discussion 

Language dynamics has attracted the attention of 
physicists, computer scientists and theoretical biologists 
alike as a challenging problem of complexity (Gomes 
et al., 1999; Smith, 2002; Steels, 2005; Stauffer and 
Schulze 2005; Brighton et al. 2005; Baxter et al., 2006; 
Kosmidis et al., 2006; Lieberman et al., 2007; Schulze 
et al., 2008; Zanette 2008; Cattuto et al., 2009; Gong 
et al., 2008). Language makes us a cooperative species
and has been crucial to our evolutionary success. It
pervades all aspects of human society. Its complexity is
extraordinary and it would be easy to conclude that any
modelling effort will end in failure. However, as occurs
with many other complex systems, important features
of language structure and dynamics can be captured 
by means of simple models. The fact that we live in 
the midst of a rapid globalization process makes the 
development of such models an important task. 

In this work we have explored the application of sev- 
eral methods from nonlinear dynamics and statistical 
physics to different aspects of language dynamics. Many 
of the above described models can be interpreted also in 
the light of ecological dynamics, generally taking species 
instead of languages. In this last section we shall discuss 
the scope of such an analogy, focusing our attention on 
some basic similarities and differences between linguis- 
tics and ecology. Some of these are summarized in table 
1. Some differences are obvious. Species are embedded 
within complex ecosystems defining networks of species 
interactions (Montoya et al, 2006). Such webs are the 
architecture of ecological organization. Although one 
could define a matrix of language-language interaction 
in terms of dominance relations of some sort, the 
equivalence would be weak. Similarly, some dynamical 
processes known to play important roles in ecology 
are absent in language dynamics. A dramatic example 
is provided by the impact of small invasions of alien 
species introduced in a given ecosystem. Very often, 
the invaders expand rapidly and trigger the collapse of 
the whole community. A small group of humans using a
foreign language would not succeed in propagating within
a much larger community of speakers, unless a huge
asymmetry in social status is at work.

5 In fact two opposite strategies can be observed in nature,
particularly when looking at the colonization of habitat by plants,
which can invest either in a few, well-protected seeds or many,
small ones. In the second case, most of the seeds will fail to
survive.



One of the most important links between languages 
and species is strongly tied to the concept of species 
and its similarity with language. As is well-known, a 
group of organisms is said to constitute a species when 
they are capable of interbreeding and they are sepa- 
rated from another group also capable of interbreeding 
with which they cannot interbreed. A community is
said to possess a language when its members can
communicate with each other efficiently using linguistic
signs and they cannot communicate with a different
community which possesses a different language. These
two conceptions are known to be problematic: there 
is, for instance, variation in the degree of success of 
hybridization between two species and in the degree of 
mutual understanding between two languages. As for 
linguistic variants, it is not uncommon that members of 
a community A understand the linguistic variant of a 
community B better than the members of B understand 
the linguistic variant of A, and quite often the decision 
of whether two linguistic variants constitute a language 
or a dialect is not guided by the interintelligibility 
criterion but by political reasons. Therefore, the bound- 
aries among groups of organisms and among linguistic 
variants as to interbreeding and interintelligibility are 
fuzzy. Both languages and species constitute continua
where the relative degree of interintelligibility and
interbreeding varies substantially depending on how close two
languages or species are in the continuum.

Competition is also a crucial concept to under- 
stand both ecological and language dynamics. Whereas 
species in contact may compete for limited resources, 
languages in contact may compete for the number of 
speakers. Since languages are not constituted of indi- 
viduals, but they are abstract systems (codes) shared 
by a community, it may seem that languages compete 
for the number of speakers only in a metaphorical sense. 
However, it is remarkable that the competition among 
languages and the competition among species can be 
mathematically modeled using similar methods. At this
point, it is necessary to take into consideration the
importance of the role of a given language as a social
status parameter in language competition, given
that different languages may be distributed differently in
society, unlike different species in an ecosystem. Moreover,
competition among different languages in contact
can be materialized in many different ways, depending
on how a given culture conceives mono/multilingualism.

Although the ecological metaphor of language
dynamics fits well with several important features, there
are a number of important linguistic phenomena which
have no equivalent in ecology. Some members of a
community may be bilingual or multilingual, i.e., they
may possess not only the traditional language of the
community (namely, their mother tongue), but also
other languages or dialects. Indeed, some members of
a community may use different languages or dialects in
different social spheres, a phenomenon called diglossia.
It is also worth noting that, when speakers of multiple
languages have to communicate and do not have the
chance to learn each other's language, they develop
a simplified code, a pidgin, which may increase its
degree of complexity over the years. However, when
a group of children is exposed to a pidgin at the
age when they acquire a language, they transform it
into a full complex language, a creole (DeGraff, 1999,
and references therein). In this context, although some
parallels have been traced between creolization and
genetic hybridization in plants (Croft 2000) they don't
seem well supported or even properly defined.

                           Species                        Languages
Nature                     Classes of living beings       Community-shared codes
Separation based on        Lack of interbreeding          Unintelligibility
Origination                Genetic/geographic isolation   Geographic barriers
Extinction causes          Competition/external events    Competition/external events
Abundance                  Two-regime scaling             Scaling law
Intermediate forms         Subspecies                     Dialects
Spatial distribution       Species-area law               Language-area scaling
Change through time        Gradual + punctuated           Gradual + punctuated
Effects of small invasion  Very important                 Rare
Mutualism                  Very important                 No
Multilingualism            No                             Very important
Network structure          Yes                            No

Table 1. A comparative list of features relating the organization and change of languages and species. The list of mechanisms
is not exhaustive: it only considers mainstream phenomena. Some parallelisms between languages and species should be
considered with care. Although small invasions have a deep impact on ecosystem organization, this factor rarely has a
remarkable effect within large linguistic communities. This is arguably related to the tendency of an invading language
to display, at the same time, a low demographic weight and a low social status. It is also interesting to observe that mutualism,
i.e., a cooperative strategy for survival that benefits two or more species, is completely absent in language dynamics. On the
contrary, multilingualism, as well as diglossia and related phenomena (see text), are features exclusive to language. Finally,
we emphasize that analogies to food webs are difficult to define in the study of language contact. However, some kind of
network abstraction is conceivable to represent the socio-cultural relations among languages or communities of speakers.

Another related and remarkable linguistic idiosyncrasy 
is the emergence of new languages ex nihilo. 
This is the case of Nicaraguan Sign Language (Kegl 
et al., 1999), which spontaneously developed among 
deaf school children in western Nicaragua over a short 
period of time once deaf individuals (until then growing 
up essentially isolated) could start communicating with 
each other. Starting from a very limited number of signs 
and unable to learn Spanish, the group rapidly 
developed a grammar, which became a complex 
language in the second "generation", as soon 
as the next group of children learned it from the first 
one. A similar situation was analysed for the Al-Sayyid 
Bedouin Sign Language, which has arisen in the last 
70 years within an isolated community (Sandler et al., 
2005). Phenomena of this type highlight the role of 
the cognitive dimension of language, which makes it 
far more flexible than species behavior. Indeed, nothing 
similar to multilingualism, diglossia or the appearance 
of new languages (pidgins and Creoles) is attested in 
non-linguistic ecological systems. Modelling such 
phenomena is still an open challenge. 

In sum, as suggested by Darwin, languages and 
ecosystems share some crucial features. These include 
spreading dynamics, the presence of dramatic 
thresholds and the role of space in favouring 
heterogeneity. In the language context, the space 
driving this enrichment need not be physical: it can 
also be interpreted as social distance. It is also 
true, however, that a close inspection of both systems 
reveals some no less interesting differences, particularly 
those related to the flexibility of individuals in acquiring 
several languages and to the social, cultural or political 
factors that constantly interfere in linguistic 
phenomena. Future efforts towards a theory of language 
change might help us understand our origins as a 
complex, social species and the future of language diversity. 